
    Engineering-Driven Learning Approaches for Bio-Manufacturing and Personalized Medicine

    Healthcare problems have a tremendous impact on human life. The past two decades have witnessed numerous biomedical research advances and clinically effective therapies, including minimally invasive surgery, regenerative medicine, and immunotherapy. However, the development of new treatment methods still relies heavily on heuristic approaches and the experience of well-trained healthcare professionals. It is therefore often hindered by patient-specific genotypes and phenotypes, operator-dependent post-surgical outcomes, and exorbitant cost. Towards clinically effective and inexpensive treatments, this thesis develops analytics-based methodologies that integrate statistics, machine learning, and advanced manufacturing.

    Chapter 1 introduces a novel function-on-function surrogate model, with application to tissue-mimicking for 3D-printed medical prototypes. Using synthetic metamaterials to mimic biological tissue, 3D-printed medical prototypes are becoming increasingly important in improving surgery success rates. Here, the objective is to model mechanical response curves as functions of metamaterial structure, and then conduct a tissue-mimicking optimization to find the best metamaterial structure. The proposed function-on-function surrogate model uses a Gaussian process for efficient emulation and optimization. For functional inputs, we propose a spectral-distance correlation function, which captures important spectral differences between two functional inputs. Dependencies among functional outputs are then modeled via a co-kriging framework. We further adopt shrinkage priors to learn and incorporate important physics. Finally, we demonstrate the effectiveness of the proposed emulator in a real-world study on heart surgery.

    Chapter 2 proposes an adaptive design method for experimentation under response censoring, which is often encountered in biomedical experiments. Censoring results in a significant loss of information, and thereby a poor predictive model over the input domain. For such problems, experimental design is paramount for maximizing predictive power with a limited budget of expensive experimental runs. We propose an integrated censored mean-squared error (ICMSE) design method, which first estimates the posterior probability of a new observation being censored and then adaptively chooses design points that minimize predictive uncertainty under censoring. Adopting a Gaussian process model with product correlation functions, our ICMSE criterion has an easy-to-evaluate expression for efficient design optimization. We demonstrate the effectiveness of the ICMSE method in an application to medical device testing.

    Chapter 3 develops an active image synthesis method for efficient labeling (AISEL) to improve learning performance in healthcare and medicine tasks, where the limited availability of data and the high cost of data collection are key challenges for applying deep neural networks. AISEL generates a complementary dataset, with labels actively acquired to incorporate the underlying physical knowledge at hand. The AISEL framework first leverages a bidirectional generative invertible network (GIN) to extract interpretable features from training images and generate physically meaningful virtual ones. It then efficiently samples virtual images both to exploit uncertain regions and to explore the entire image space. We demonstrate the effectiveness of AISEL on a heart surgery study, where it lowers the labeling cost by 90% while achieving a 15% improvement in prediction accuracy.

    Chapter 4 presents a calibration-free statistical framework for the promising chimeric antigen receptor T cell therapy for fighting cancers. The objective is to effectively recover critical quality attributes under intrinsic patient-to-patient variability, and thereby lower the cost of cell therapy. Our calibration-free approach models the patient-to-patient variability via a patient-specific calibration parameter. We adopt multiple biosensors to construct a patient-invariance statistic that alleviates the effect of the calibration parameter. Using this statistic, we can then recover the critical quality attribute during cell culture, free from the calibration parameter. In a T cell therapy study, our method effectively recovers viable cell concentration for cell culture monitoring and scale-up.
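    Both Chapters 1 and 2 build on Gaussian process emulation; Chapter 2 in particular adopts a product correlation function. As a minimal sketch of that building block (not the thesis's actual implementation; the squared-exponential form, lengthscales, and nugget value here are illustrative assumptions), GP prediction under a unit-variance product correlation kernel can be written as:

```python
import numpy as np

def product_corr(X1, X2, lengthscales):
    # Product of one-dimensional squared-exponential correlations
    # across input dimensions (the "product correlation" structure)
    d = (X1[:, None, :] - X2[None, :, :]) / lengthscales
    return np.exp(-0.5 * np.sum(d ** 2, axis=-1))

def gp_predict(X_train, y_train, X_new, lengthscales, nugget=1e-6):
    # Standard GP conditional mean and variance; the predictive
    # variance is what a design criterion like ICMSE would act on
    K = product_corr(X_train, X_train, lengthscales) + nugget * np.eye(len(X_train))
    k_star = product_corr(X_new, X_train, lengthscales)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = k_star @ alpha
    v = np.linalg.solve(L, k_star.T)
    var = 1.0 - np.sum(v ** 2, axis=0)  # prior variance is 1 for a correlation kernel
    return mean, var
```

    The predictive variance returned here is exactly the quantity an adaptive design would minimize when choosing the next run, subject to the censoring adjustment the ICMSE criterion adds.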

    Plant Phenotyping on Mobile Devices

    Plant phenotyping is a fast and non-destructive way to obtain the physiological features of plants, compared with expensive and time-consuming chemical analysis of plant samples. Through plant phenotyping, scientists and farmers can assess plant health status more accurately than by visual inspection, avoiding wasted time and resources and even predicting productivity. However, the size and price of current plant phenotyping equipment restrict it from being widely applied at the farmer-household level. Everyday field operation is barely achievable because easy-to-carry and cost-effective equipment such as hyperspectral cameras, infrared cameras, and thermal cameras is unavailable. A plant phenotyping tool on mobile devices would make plant phenotyping technology more accessible to ordinary farmers and researchers. This application incorporates physical optics, plant science models, and the image processing capability of smartphones. With our special optical design, multispectral images, rather than RGB (red, green, and blue) images, can be obtained from smartphones at fairly low cost. Through quick image processing on the smartphone, the app provides accurate predictions of plant physiological features such as water, chlorophyll, and nitrogen content. The prediction models are provided by Purdue's plant phenotyping team. Once widely adopted, the information collected by smartphones running the app will be sent back to Purdue's plant health big-data database. This feedback will not only allow us to improve our models, but also give farmers and agricultural researchers easy access to real-time crop health data.
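    The app's actual prediction models come from Purdue's team and are not reproduced here. As an illustration of the kind of computation a multispectral phenotyping app performs, a standard vegetation index such as NDVI can be derived from the near-infrared and red bands that an RGB camera alone cannot separate:

```python
import numpy as np

def ndvi(nir, red, eps=1e-8):
    # Normalized Difference Vegetation Index from near-infrared and red
    # reflectance bands; higher values indicate denser, healthier vegetation
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)  # eps guards against division by zero
```

    Healthy leaves reflect strongly in the near-infrared and absorb red light, so a healthy pixel (e.g. nir = 0.5, red = 0.1) scores noticeably higher than a stressed one (e.g. nir = 0.3, red = 0.2).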

    Deep Generative Imputation Model for Missing Not At Random Data

    Data analysis often suffers from the Missing Not At Random (MNAR) problem, in which the cause of a value's missingness is not fully observed. Compared to the simpler Missing Completely At Random (MCAR) setting, MNAR is more in line with realistic scenarios but also more complex and challenging. Existing statistical methods model the MNAR mechanism via different decompositions of the joint distribution of the complete data and the missing mask. However, we empirically find that directly incorporating these statistical methods into deep generative models is sub-optimal: it neglects the confidence of the reconstructed mask during the MNAR imputation process, which leads to insufficient information extraction and less-guaranteed imputation quality. In this paper, we revisit the MNAR problem from a novel perspective: the complete data and the missing mask are two modalities of incomplete data on an equal footing. Along this line, we put forward a generative-model-specific joint probability decomposition method, the conjunction model, to represent the distributions of the two modalities in parallel and extract sufficient information from both the complete data and the missing mask. Taking a step further, we develop a deep generative imputation model, namely GNR, to process the real-world missing mechanism in the latent space and concurrently impute the incomplete data and reconstruct the missing mask. Experimental results show that GNR surpasses state-of-the-art MNAR baselines by significant margins (improvements of 9.9% to 18.8% in RMSE on average) and consistently yields better mask reconstruction accuracy, which makes the imputation more principled.
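    To see why MNAR is harder than MCAR, a small simulation (illustrative only; the masking mechanism below is an assumption, not the paper's GNR model) shows that a value-dependent mask biases the observed data, while an independent mask does not:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 1))           # the "complete data" modality

# MCAR: the mask is independent of the data values
m_mcar = rng.random(x.shape) < 0.3       # True = missing

# MNAR: larger values are more likely to go missing, so the mask
# itself carries information about the unobserved data
p_miss = 1.0 / (1.0 + np.exp(-2.0 * x))  # missingness probability, logistic in x
m_mnar = rng.random(x.shape) < p_miss

# The observed mean stays close to the true mean under MCAR but is
# pulled downward under MNAR, since large values are preferentially hidden
print(x.mean(), x[~m_mcar].mean(), x[~m_mnar].mean())
```

    Under MNAR the observed sample systematically underestimates the mean, which is why treating the mask as a second modality and extracting information from it matters for imputation quality.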

    Project Florida: Federated Learning Made Easy

    We present Project Florida, a system architecture and software development kit (SDK) enabling deployment of large-scale Federated Learning (FL) solutions across a heterogeneous device ecosystem. Federated learning is an approach to machine learning based on a strong data sovereignty principle, i.e., that the privacy and security of data are best ensured by storing it at its origin, whether on end-user devices or in segregated cloud storage silos. Federated learning enables model training across devices and silos while the training data remains within its security boundary, by distributing a model snapshot to a client running inside the boundary, running client code to update the model, and then aggregating updated snapshots across many clients in a central orchestrator. Deploying an FL solution requires implementing complex privacy and security mechanisms as well as scalable orchestration infrastructure. Scale and performance are paramount concerns, as the model training process benefits from the full participation of many client devices, which may have a wide variety of performance characteristics. Project Florida aims to simplify the task of deploying cross-device FL solutions by providing cloud-hosted infrastructure and accompanying task management interfaces, as well as a multi-platform SDK supporting most major programming languages, including C++, Java, and Python, enabling FL training across a wide range of operating system (OS) and hardware specifications. The architecture decouples service management from the FL workflow, enabling a cloud service provider to deliver FL-as-a-service (FLaaS) to ML engineers and application developers. We present an overview of Florida, including a description of the architecture, sample code, and illustrative experiments demonstrating system capabilities.
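    The distribute-update-aggregate loop described above can be sketched as one federated averaging round (a minimal sketch, not Project Florida's SDK; the linear least-squares client, learning rate, and size-weighted aggregation are illustrative assumptions):

```python
import numpy as np

def local_update(w, X, y, lr=0.1, epochs=5):
    # Client side: a few full-batch gradient descent epochs on a linear
    # least-squares model, run entirely inside the client's security boundary
    w = w.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(w, clients):
    # Orchestrator side: distribute the snapshot w to every client, then
    # aggregate the returned snapshots weighted by local dataset size;
    # raw training data never leaves the clients
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    updates = np.stack([local_update(w, X, y) for X, y in clients])
    return (sizes[:, None] * updates).sum(axis=0) / sizes.sum()
```

    Only model snapshots cross the boundary in either direction, which is the data sovereignty property the abstract describes; a production system layers secure aggregation, authentication, and scheduling on top of this loop.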